Topic Models for Dynamic Translation Model Adaptation

نویسندگان

  • Vladimir Eidelman
  • Jordan L. Boyd-Graber
  • Philip Resnik
چکیده

We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorporate them into our translation model as features. Conditioning lexical probabilities on the topic biases translations toward topicrelevant output, resulting in significant improvements of up to 1 BLEU and 3 TER on Chinese to English translation over a strong baseline.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Topic Models for Translation Domain Adaptation

Topic models have been successfully applied in domain adaptation for translation models. However, previous works applied topic models only on source side and ignored the relations between source and target languages in machine translation. This paper corrects this omission by learning models that can also use targetside information to discover more distinct topics: tree-based topic models and p...

متن کامل

Rapid Unsupervised Topic Adaptation – a Latent Semantic Approach

In open-domain language exploitation applications, a wide variety of topics with swift topic shifts has to be captured. Consequently, it is crucial to rapidly adapt all language components of a spoken language system. This thesis addresses unsupervised topic adaptation in both monolingual and crosslingual settings. For automatic speech recognition we rapidly adapt a language model on a source l...

متن کامل

Polylingual Tree-Based Topic Models for Translation Domain Adaptation

Topic models, an unsupervised technique for inferring translation domains improve machine translation quality. However, previous work uses only the source language and completely ignores the target language, which can disambiguate domains. We propose new polylingual tree-based topic models to extract domain knowledge that considers both source and target languages and derive three different inf...

متن کامل

Dynamic Topic Adaptation for Phrase-based MT

Translating text from diverse sources poses a challenge to current machine translation systems which are rarely adapted to structure beyond corpus level. We explore topic adaptation on a diverse data set and present a new bilingual variant of Latent Dirichlet Allocation to compute topic-adapted, probabilistic phrase translation features. We dynamically infer document-specific translation probab...

متن کامل

Measuring a Dynamic Efficiency Based on MONLP Model under DEA Control

Data envelopment analysis (DEA) is a common technique in measuring the relative efficiency of a set of decision making units (DMUs) with multiple inputs and multiple outputs. ‎‎Standard DEA models are ‎‎quite limited models‎, ‎in the sense that they do not consider a DMU ‎‎at different times‎. ‎To resolve this problem‎, ‎DEA models with dynamic ‎‎structures have been proposed‎.‎In a recent pape...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012